Supplementary Material: Learning Representations from Audio-Visual Spatial Alignment

Neural Information Processing Systems

These are transformer networks of base dimension 512 and expansion ratio 4. In other words, the output dimensionality of the linear transformations of parameters W_key, W_qry, W_val, W_0 and W_2 is 512, and that of W_1 is 2048. Models are pre-trained to optimize loss (7) for the AVC task or (9) for the AVTS and AVSA tasks. As originally proposed, lateral connections are implemented with a 1x1 convolution that maps all feature maps into a 128-dimensional space, followed by a 3x3 convolution for increased smoothing. Thus, all pixels for which the state-of-the-art model was less than 75% confident were left unlabeled. These low-confidence regions were also ignored while computing evaluation metrics.
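The dimensions above can be made concrete with a minimal sketch of the position-wise feed-forward block: with base dimension 512 and expansion ratio 4, W_1 maps 512 to 2048 and W_2 projects back to 512. Variable names here are illustrative, not taken from the paper's code.

```python
import numpy as np

# Sketch of the feed-forward dimensions described above (assumed structure):
# base dimension 512, expansion ratio 4, so W1: 512 -> 2048 and W2: 2048 -> 512.
d_model, expansion = 512, 4
rng = np.random.default_rng(0)

W1 = rng.standard_normal((d_model, d_model * expansion)) * 0.02  # 512 -> 2048
W2 = rng.standard_normal((d_model * expansion, d_model)) * 0.02  # 2048 -> 512

def feed_forward(x):
    """Position-wise feed-forward block: expand, apply nonlinearity, project back."""
    return np.maximum(x @ W1, 0.0) @ W2  # ReLU in the expanded 2048-dim space

tokens = rng.standard_normal((10, d_model))  # 10 tokens of dimension 512
out = feed_forward(tokens)
print(out.shape)  # (10, 512)
```

The attention projections W_key, W_qry, W_val, and W_0 follow the same pattern but keep the output dimensionality at 512.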


Learning Generative Vision Transformer with Energy-Based Latent Space for Saliency Prediction

Neural Information Processing Systems

Vision transformer networks have shown superiority in many computer vision tasks. In this paper, we take a step further by proposing a novel generative vision transformer with latent variables following an informative energy-based prior for salient object detection. Both the vision transformer network and the energy-based prior model are jointly trained via Markov chain Monte Carlo-based maximum likelihood estimation, in which sampling from the intractable posterior and prior distributions of the latent variables is performed by Langevin dynamics. Further, with the generative vision transformer, we can easily obtain a pixel-wise uncertainty map from an image, which indicates the model's confidence in predicting saliency from the image. Unlike existing generative models, which define the prior distribution of the latent variables as a simple isotropic Gaussian, our model uses an informative energy-based prior that is more expressive in capturing the latent space of the data. We apply the proposed framework to both RGB and RGB-D salient object detection tasks. Extensive experimental results show that our framework achieves not only accurate saliency predictions but also meaningful uncertainty maps that are consistent with human perception.
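The Langevin dynamics sampling mentioned in the abstract can be sketched in a few lines. The quadratic toy energy below is a stand-in for the paper's learned energy-based model; step size, step count, and function names are illustrative assumptions.

```python
import numpy as np

# Minimal sketch of Langevin dynamics sampling from an energy E(z):
# z_{t+1} = z_t - (step/2) * dE/dz + sqrt(step) * noise.
# The toy energy here is NOT the paper's learned EBM prior.
def langevin_sample(grad_energy, z0, step=0.01, n_steps=100, rng=None):
    if rng is None:
        rng = np.random.default_rng(0)
    z = z0.copy()
    for _ in range(n_steps):
        z = z - 0.5 * step * grad_energy(z) + np.sqrt(step) * rng.standard_normal(z.shape)
    return z

# Toy energy E(z) = ||z||^2 / 2, whose gradient is z; samples drift toward N(0, I).
z = langevin_sample(lambda z: z, z0=np.full(8, 5.0))
print(z.shape)  # (8,)
```

In the paper's setting, the same update is run with gradients of the learned energy (for the prior) or of the posterior log-density (for inference), rather than this toy quadratic.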


NxMTransformer: Semi-Structured Sparsification for Natural Language Understanding via ADMM

Neural Information Processing Systems

Natural Language Processing (NLP) has recently achieved great success by using huge pre-trained Transformer networks. However, these models often contain hundreds of millions or even billions of parameters, bringing challenges to online deployment due to latency constraints. Recently, hardware manufacturers have introduced dedicated hardware for NxM sparsity to provide the flexibility of unstructured pruning with the runtime efficiency of structured approaches. NxM sparsity permits arbitrarily selecting M parameters to retain from a contiguous group of N in the dense representation. However, due to the extremely high complexity of pre-trained models, standard sparse fine-tuning techniques often fail to generalize well on downstream tasks, which typically have limited data.
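The NxM pattern described above can be illustrated with a simple magnitude-based pruning sketch (e.g. the 2:4 pattern supported by recent accelerators): within each contiguous group of N weights, keep the M largest-magnitude values and zero the rest. This is a generic baseline for illustration, not the paper's ADMM-based method.

```python
import numpy as np

# Illustrative N:M semi-structured pruning: in each contiguous group of N
# weights, retain the M largest-magnitude entries and zero the others.
def nm_prune(weights, n=4, m=2):
    w = weights.reshape(-1, n)                    # groups of N contiguous weights
    keep = np.argsort(-np.abs(w), axis=1)[:, :m]  # indices of the M largest magnitudes
    mask = np.zeros_like(w)
    np.put_along_axis(mask, keep, 1.0, axis=1)    # 1.0 where a weight is kept
    return (w * mask).reshape(weights.shape)

w = np.array([0.9, -0.1, 0.4, 0.05, -0.7, 0.2, 0.1, 0.8])
pruned = nm_prune(w)
print(pruned)  # keeps 0.9, 0.4 in the first group and -0.7, 0.8 in the second
```

Magnitude pruning alone is the weak baseline the abstract alludes to; the paper's contribution is fine-tuning under this constraint (via ADMM) so accuracy survives on low-resource downstream tasks.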



InEKFormer: A Hybrid State Estimator for Humanoid Robots

Hohmeyer, Lasse, Popescu, Mihaela, Bergonzani, Ivan, Mronga, Dennis, Kirchner, Frank

arXiv.org Artificial Intelligence

Humanoid robots have great potential for a wide range of applications, including industrial and domestic use, healthcare, and search and rescue missions. However, bipedal locomotion in different environments is still a challenge when it comes to performing stable and dynamic movements. This is where state estimation plays a crucial role, providing fast and accurate feedback of the robot's floating base state to the motion controller. Although classical state estimation methods such as Kalman filters are widely used in robotics, they require expert knowledge to fine-tune the noise parameters. Due to recent advances in the field of machine learning, deep learning methods are increasingly used for state estimation tasks. In this work, we propose the InEKFormer, a novel hybrid state estimation method that incorporates an invariant extended Kalman filter (InEKF) and a Transformer network. We compare our method with the InEKF and the KalmanNet approaches on datasets obtained from the humanoid robot RH5. The results indicate the potential of Transformers in humanoid state estimation, but also highlight the need for robust autoregressive training in these high-dimensional problems.
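The hand-tuned noise parameters that motivate the abstract's learned estimator can be seen in even the simplest Kalman filter. The scalar example below is a generic textbook update with assumed values for process noise q and measurement noise r; it is not the paper's InEKF or InEKFormer.

```python
# Minimal scalar Kalman filter, showing the noise parameters (process noise q,
# measurement noise r) that classical filters require experts to hand-tune.
# Values of q and r here are arbitrary illustrative assumptions.
def kalman_step(x, p, z, q=0.01, r=0.1):
    # Predict: the state is assumed constant; uncertainty grows by q.
    p = p + q
    # Update: blend prediction and measurement z via the Kalman gain k.
    k = p / (p + r)
    x = x + k * (z - x)
    p = (1.0 - k) * p
    return x, p

x, p = 0.0, 1.0                    # initial estimate and uncertainty
for z in [1.2, 0.9, 1.1, 1.0]:     # noisy measurements of a constant state
    x, p = kalman_step(x, p, z)
print(x, p)                        # estimate converges near 1.0, uncertainty shrinks
```

Hybrid approaches like the one above keep such a filter as the backbone while a learned network (here, a Transformer) supplies quantities that would otherwise be hand-tuned.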